Experiment Tracking Interface Specification =================================================== .. epigraph:: "Strava but for training runs." -- RLtools contributor The experiments follow the hierarchical, filesystem-based experiment tracking interface. We follow the `UNIX` philosophy and expose it through the filesystem. This makes it very easy to sort, filter and analyze the experiments with just unix tools like ``find``, ``sort``, ``grep`` and ``jq``. Additionally it also allows easy monitoring through a static website like https://zoo.rl.tools where experiment results are automatically made accessible by our continuous integration. Run Identifier --------------- Each run is characterized by the following path/identifier structure: .. code-block:: none {TIME}/{COMMIT}_{NAME}_{POPULATION}/{CONFIG}/{SEED} e.g. .. code-block:: none 2024-05-26_06-26-52/4f717cb_zoo_algorithm_environment/sac_pendulum-v1/0000 - **{TIME}**: e.g. ``2024-05-26_06-26-52`` - **TIME**: The time when the experiment was executed. The time is the most significant component to facilitate lexicographical sorting (e.g. ``find . | sort``). On Linux It can e.g. be generated by ``date '+%Y-%m-%d_%H-%M-%S'`` - **{COMMIT}_{NAME}_{POPULATION}**: e.g. ``4f717cb_zoo_algorithm_environment`` - **COMMIT** (e.g. ``4f717cb``: The first 7 hexadecimal places of the commit hash - **NAME** (e.g. ``zoo``): Name of the experiment. The ``return.json`` should be in a consistent per **NAME** - **POPULATION** (e.g. ``algorithm_environment``): The variables that are varied/tested in this experiment (separated by underscores) - **{CONFIG}**: e.g. ``sac_pendulum-v1`` - **CONFIG**: The values of the previously mentioned **POPULATION** variables for a particular experiment configuration, separated by underscores. In combination with the previous population setup, this yields e.g. ``{"algorithm": "sac", "environment": "pendulum-v1"}`` - **{SEED}**: e.g. ``1337`` - **SEED**: The seed for a particular run of the previously described configuration Run Contents ------------ Per Run ~~~~~~~ Each run can contain different files that apply to the whole run, e.g.: .. code-block:: none 2024-05-26_06-26-52/4f717cb_zoo_algorithm_environment/sac_pendulum-v1/0000/return.json 2024-05-26_06-26-52/4f717cb_zoo_algorithm_environment/sac_pendulum-v1/0000/logs.tfevents - **return.json**: The return value of the experiment. The structure and semantics of this file depend on the **{NAME}**. The existence of this file signals that the experiment is finished. - **config.json** (`optional`): Additional configuration - **logs.tfevents** (`optional`): Tensorboard logs - **ui.js**/**ui.esm.js** (`optional`): Render function for the UI (see https://studio.rl.tools for more info). This allows collected trajectories to be rendered Per Step ~~~~~~~~~ Additionally each run can store files at different stages of the run (corresponding to "steps" of the `Loop Interface `_) e.g.: .. code-block:: none 2024-05-26_06-26-52/4f717cb_zoo_algorithm_environment/sac_pendulum-v1/0000/steps/000000000012000/checkpoint.h5 2024-05-26_06-26-52/4f717cb_zoo_algorithm_environment/sac_pendulum-v1/0000/steps/000000000012000/trajectories.json - **steps**: Directory to separate the step contents from the full-run contents - **{STEP}**: e.g. ``000000000012000`` - **checkpoint.h** (`optional`): Checkpoint exported as a C++ header file such that it can be compiled into e.g. the firmware of a microcontroller. - **checkpoint.h5** (`optional`): Checkpoint exported as an HDF5 file. - **trajectories.json**/**trajectories.json.gz** (`optional`): Trajectories from the evaluation loop step. These trajectories contain state, action, reward and termination signal as JSON objects and hence are compatible with the render function from ``ui.js`` - **evaluation_results.json** (`optional`): Statistics (episode length and return) of the evaluation step